An efficient approach for mining web content sensitivity

Authors

  • Cheng Wang
  • Ying Liu
  • Liheng Jian
  • Peng Zhang
Abstract

Abnormal remarks on the web involving violence, threats, superstition, and the like (referred to as sensitive content) may disturb social order and public morality. To provide a quantitative measure, we propose the concept of web content sensitivity, which measures how sensitive a page is. We also propose a web content sensitivity mining approach. Our experiment identified a number of sensitive webpages that traditional frequency-based methods failed to find. Varying the sensitivity values of the keywords yielded different sets of high-sensitivity keywords, along with the corresponding webpages.

Similar resources

High Fuzzy Utility Based Frequent Patterns Mining Approach for Mobile Web Services Sequences

Nowadays, high fuzzy utility based pattern mining is an emerging topic in data mining. It refers to discovering all patterns whose utility meets a user-specified minimum high utility threshold. It comprises extracting patterns that are highly accessed in mobile web service sequences. Different from the traditional fuzzy approach, high fuzzy utility mining considers not only counts of mob...

Calculation of One-dimensional Forward Modelling of Helicopter-borne Electromagnetic Data and a Sensitivity Matrix Using Fast Hankel Transforms

The helicopter-borne electromagnetic (HEM) frequency-domain exploration method is an airborne electromagnetic (AEM) technique that is widely used for vast and rough areas for resistivity imaging. The vast amount of digitized data flowing from the HEM method requires an efficient and accurate inversion algorithm. Generally, the inverse modelling of HEM data in the first step requires a precise a...

A Method for Extracting Information from the Web Using Deep Learning Algorithm

Web mining research is becoming more important nowadays because large amounts of data are managed through the internet. Web usage is increasing in an uncontrolled manner. A specific system is needed for controlling such a large amount of data in the web space. Web mining is classified into three major divisions: web content mining, web usage mining and web ...

Data Extraction using Content-Based Handles

In this paper, we present an approach and a visual tool, called HWrap (Handle Based Wrapper), for creating web wrappers to extract data records from web pages. In our approach, we mainly rely on the visible page content to identify data regions on a web page. Our extraction algorithm is inspired by the way a human user scans the page content for specific data. In particular, we use text fea...

Expert Discovery: A web mining approach

Expert discovery is the search for an answer to the question: “Who is the best expert of a specific subject in a particular domain within a particular array of parameters?” An expert with domain knowledge in any field is crucial for consulting in industry, academia and the scientific community. The aim of this study is to address the issues of the expert-finding task in a real-world community. Collabor...

Journal:
  • I. J. Knowledge and Web Intelligence

Volume: 1

Publication date: 2009